optimize LazyBranch range getindex and return VoV #122

Merged on Oct 4, 2021 (1 commit)

aminnj (Member) commented Oct 1, 2021

Closes #113

With this PR, Base.getindex(ba::LazyBranch{T,J,B}, range::UnitRange) reads only the baskets that overlap the range, concatenates them, and trims the entries that fall outside the range at both ends. This avoids the accumulated cost of the many individual getindex calls in [ba[i] for i in range] (the behavior on master). For a jagged branch, the return value is now a VectorOfVectors.
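The read, concatenate, and trim steps can be illustrated with a toy model that uses only Base Julia. This is a minimal sketch, not the code in this PR: `ToyBranch`, `basket_of`, and `range_getindex` are hypothetical names, the baskets are plain in-memory vectors rather than decompressed ROOT baskets, and bounds checking is omitted.

```julia
# Toy stand-in for a branch whose entries are stored in consecutive baskets.
# All names here are hypothetical and are not part of the UnROOT API.
struct ToyBranch
    baskets::Vector{Vector{Float32}}  # contents of each basket
    first_entry::Vector{Int}          # global index of the first entry in each basket
end

function ToyBranch(baskets::Vector{Vector{Float32}})
    # Basket k starts right after all entries of baskets 1..k-1.
    first_entry = cumsum([1; length.(baskets)[1:end-1]])
    return ToyBranch(baskets, first_entry)
end

# Index of the basket that holds global entry `i`.
basket_of(b::ToyBranch, i::Integer) = searchsortedlast(b.first_entry, i)

function range_getindex(b::ToyBranch, r::UnitRange{Int})
    isempty(r) && return Float32[]
    ib1, ib2 = basket_of(b, first(r)), basket_of(b, last(r))
    # Read each overlapping basket once and concatenate the results.
    joined = reduce(vcat, b.baskets[ib1:ib2])
    # Trim the entries before first(r) and after last(r).
    offset = b.first_entry[ib1] - 1
    return joined[(first(r) - offset):(last(r) - offset)]
end

data = Float32.(1:10)
branch = ToyBranch([data[1:3], data[4:7], data[8:10]])
result = range_getindex(branch, 3:8)   # touches all three baskets
```

In this model each basket is visited once per range request, whereas an element-by-element comprehension would do a basket lookup for every index.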

I'm using a zlib test file. The 40*10^6:43*10^6 range spans 5 baskets, while the 1:10^6 range spans 450 baskets. The improvement is more obvious for the first range.

Out of curiosity, I also tested append!() and found that it performs basically the same as vcat.

julia> using UnROOT ; const tf = LazyTree(ROOTFile("Run2012BC_DoubleMuParked_Muons.root"),"Events");

master

[ba[i] for i in range]
julia> @btime tf.Muon_pt[40*10^6:43*10^6]
  460.809 ms (217 allocations: 403.03 MiB)
3000001-element Vector{SubArray{Float32, 1, Vector{Float32}, Tuple{UnitRange{Int64}}, true}}:

julia> @btime tf.Muon_pt[1:10^6]
  172.634 ms (18903 allocations: 154.38 MiB)
1000000-element Vector{SubArray{Float32, 1, Vector{Float32}, Tuple{UnitRange{Int64}}, true}}:

by basket with vcat

vcat((basketarray(ba, i) for i in ib1:ib2)...)
julia> @btime tf.Muon_pt[40*10^6:43*10^6]
  416.985 ms (3251 allocations: 392.10 MiB)
3000001-element VectorOfVectors{Float32, Vector{Float32}, Vector{Int32}, Vector{Tuple{}}}:

julia> @btime tf.Muon_pt[1:10^6]
  158.825 ms (21193 allocations: 152.86 MiB)
1000000-element VectorOfVectors{Float32, Vector{Float32}, Vector{Int32}, Vector{Tuple{}}}:
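The type parameters in the output above (`Vector{Float32}` for the values and `Vector{Int32}` for the element offsets) suggest a flat data buffer plus an offsets vector. The following is a minimal sketch of that layout using only Base Julia. `ToyJagged`, `event`, and `slice` are hypothetical names, not the ArraysOfArrays.jl implementation, and empty ranges are not handled.

```julia
# Toy jagged container: all values back to back, plus per-event offsets.
# Hypothetical names; this is not the ArraysOfArrays.jl VectorOfVectors type.
struct ToyJagged
    data::Vector{Float32}    # values of all events, concatenated
    offsets::Vector{Int32}   # event i owns data[offsets[i]:offsets[i+1]-1]
end

nevents(j::ToyJagged) = length(j.offsets) - 1

# A view into the flat buffer, so reading one event allocates no new data.
event(j::ToyJagged, i::Integer) = view(j.data, j.offsets[i]:(j.offsets[i + 1] - 1))

# Taking a range of events is one contiguous copy plus an offset shift.
function slice(j::ToyJagged, r::UnitRange{Int})
    lo, hi = j.offsets[first(r)], j.offsets[last(r) + 1] - 1
    new_offsets = j.offsets[first(r):(last(r) + 1)] .- (lo - Int32(1))
    return ToyJagged(j.data[lo:hi], new_offsets)
end

jagged = ToyJagged(Float32[10, 11, 20, 30, 31, 32], Int32[1, 3, 4, 7])
middle = slice(jagged, 2:3)   # events 2 and 3
```

With this layout, concatenating or trimming jagged data operates on two flat vectors instead of one small array per event.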

by basket with append

out = basketarray(ba, ib1)
for ib in (ib1+1):ib2
  append!(out, basketarray(ba, ib))
end
julia> @btime tf.Muon_pt[40*10^6:43*10^6]
  422.956 ms (3236 allocations: 357.87 MiB)
3000001-element VectorOfVectors{Float32, Vector{Float32}, Vector{Int32}, Vector{Tuple{}}}:

julia> @btime tf.Muon_pt[1:10^6]
  161.536 ms (20294 allocations: 154.91 MiB)
1000000-element VectorOfVectors{Float32, Vector{Float32}, Vector{Int32}, Vector{Tuple{}}}:
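For reference, the two concatenation strategies compared above can be reproduced on plain vectors. This is a small sketch with made-up chunks standing in for basket contents; it only shows that both approaches produce the same result and says nothing about their relative speed.

```julia
# Made-up chunks standing in for the contents of consecutive baskets.
chunks = [Float32[1, 2], Float32[3], Float32[4, 5, 6]]

# Strategy 1: splat all chunks into a single vcat call.
via_vcat = vcat(chunks...)

# Strategy 2: grow an output vector in place with append!.
# The first chunk is copied so the input is not mutated (a choice made for this sketch).
via_append = copy(chunks[1])
for chunk in chunks[2:end]
    append!(via_append, chunk)
end
```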

codecov bot commented Oct 1, 2021

Codecov Report

Merging #122 (f106d33) into master (d98fe87) will increase coverage by 0.02%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #122      +/-   ##
==========================================
+ Coverage   92.61%   92.63%   +0.02%     
==========================================
  Files          11       11              
  Lines        1381     1385       +4     
==========================================
+ Hits         1279     1283       +4     
  Misses        102      102              

| Impacted Files | Coverage Δ |
|---|---|
| src/iteration.jl | 90.68% <100.00%> (+0.23%) ⬆️ |


@tamasgal tamasgal merged commit e612c64 into JuliaHEP:master Oct 4, 2021
Moelf pushed a commit to aminnj/UnROOT.jl that referenced this pull request Jun 23, 2022
Successfully merging this pull request may close these issues.

Optimize lazytree[range]